Taxonomically Clustering Organisms Based on the Profiles of Gene Sequences Using PCA
نویسندگان
چکیده
The biological implications of bioinformatics can already be seen in various implementations. Biological taxonomy may seem like a simple science in which the biologists merely observe similarities among organisms and construct classifications according to those similarities, but it is not so simple. By applying data mining techniques on gene sequence database we can cluster the data to find interesting similarities in the gene expression data. One of the applications of such kind of clustering is taxonomically clustering the organisms based on their gene sequential expressions. In this study we outlined a method for taxonomical clustering of species of the organisms based on the genetic profile using Principal Component Analysis and Self Organizing Neural Networks. We have implemented the idea using Matlab and tried to cluster the gene sequences taken from PAUP version of the ML5/ML6 database. The taxa used for some of the basidiomycetous fungi form the database. To study the scalability issues another large gene sequence database was used. The proposed method clustered the species of organisms correctly in almost all the cases. The obtained were more significant and promising. The proposed method clustered the species of organisms correctly in almost all the cases. The obtained results were more significant and promising.
منابع مشابه
Clustering of a Number of Genes Affecting in Milk Production using Information Theory and Mutual Information
Information theory is a branch of mathematics. Information theory is used in genetic and bioinformatics analyses and can be used for many analyses related to the biological structures and sequences. Bio-computational grouping of genes facilitates genetic analysis, sequencing and structural-based analyses. In this study, after retrieving gene and exon DNA sequences affecting milk yield in dairy ...
متن کاملData Mining for Identification of Forkhead Box O (FOXO3a) in Different Organisms Using Nucleotide and Tandem Repeat Sequences
Background: Deregulation of FOXO3a gene which belongs to Forkhead box O (FOXO) transcription factors, can cause cancer (e.g. breast cancer). FOXO factors have important role in ubiquitination, acetylation, de-acetylation, protein-protein interactions and phosphorylation. Understanding the regulation and mechanisms of FOXO3a can lead to cancer treatment. The aim of this study recent association...
متن کاملDNA Fingerprinting Based on Repetitive Sequences of Iranian Indigenous Lactobacilli Species by (GTG)5- REP-PCR
Background and Objective: The use of lactobacilli as probiotics requires the application of accurate and reliable methods for the detection and identification of bacteria at the strain level. Repetitive sequence-based polymerase chain reaction (rep-PCR), a DNA fingerprinting technique, has been successfully used as a powerful molecular typing method to determine taxonomic and phylogenetic relat...
متن کاملSignal processing approaches as novel tools for the clustering of N-acetyl-β-D-glucosaminidases
Nowadays, the clustering of proteins and enzymes in particular, are one of the most popular topics in bioinformatics. Increasing number of chitinase genes from different organisms and their sequences have beenidentified. So far, various mathematical algorithms for the clustering of chitinase genes have been used butmost of them seem to be confusing and sometimes insufficient. In the...
متن کاملDesigning Of Degenerate Primers-Based Polymerase Chain Reaction (PCR) For Amplification Of WD40 Repeat-Containing Proteins Using Local Allignment Search Method
Degenerate primers-based polymerase chain reaction (PCR) are commonly used for isolation of unidentified gene sequences in related organisms. For designing the degenerate primers, we propose the use of local alignment search method for searching the conserved regions long enough to design an acceptable primer pair. To test this method, a WD40 repeat-containing domain protein from Beauveria bass...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005